Circuit of on-chip network having four-node ring switch structure

ABSTRACT

A hierarchical ring architecture is constructed with on-chip networks. Each on-chip network includes two type-0 ring nodes and two type-1 ring nodes. The present invention provides multiple parallel data transfers, with dynamic configuration, between multiple processor cores or between multiple function units and register banks. The present invention thus obtains a low control complexity, an optimized local bandwidth, an optimized remote node path, a low routing complexity and a simplified circuit.

FIELD OF THE INVENTION

The present invention relates to an on-chip network; more particularly, relates to providing an optimized system-on-chip (SOC) design integrating multiple processor cores and IPs.

DESCRIPTION OF THE RELATED ART(S)

Multi-core processors, and single-core processors with multiple function units, are now in widespread use. Communication between those components (multiple cores or multiple function units) is important. Yet a traditional single-channel bus is not enough to support high-bandwidth, high-efficiency multi-component communication. Other buses, such as a crossbar bus, a multi-layer bus or a multi-node ring bus, enhance bandwidth and efficiency. However, path collision, control complexity, area overhead and power consumption are not easily optimized. The present invention targets the improvement of ring-based on-chip networks. In ring-based interconnect design, increasing the number of ring nodes heightens design complexity in arbitration, buffer usage, path optimization and so on. In arbitration especially, the complexity of the arbiter increases exponentially with the number of ring nodes. Having more ring nodes share one ring path limits bandwidth and increases traffic. However, a design in which multiple ring nodes share multiple ring paths is not easy to produce.

To solve this problem, the present invention adopts a design policy of providing a simple, efficient ring structure and a hierarchical construction. It proposes a four-node ring switch structure with a hierarchical ring to construct ring-based interconnects.

SUMMARY OF THE INVENTION

The main purpose of the present invention is to provide multiple data transfer in parallel with dynamic configuration between multiple processor cores or multiple function units and registers for obtaining a low control complexity, an optimized local bandwidth, an optimized remote node path, a low routing complexity and a simplified circuit.

To achieve the above purpose, the present invention is a circuit of an on-chip network having a four-node ring switch structure, comprising two type-0 ring nodes and two type-1 ring nodes for dual-directional data transfer, where each type-0 ring node comprises three data input ports, three data output ports and five data transferring lines (as shown in FIG. 1B); the type-0 ring node transfers and switches data; two pairs of the data input ports and the data output ports of the type-0 ring node are for left side connection and right side connection data transfers; the other pair of the data input port and the data output port of the type-0 ring node is for input interface and output interface data transfers; each type-1 ring node comprises four data input ports, four data output ports and nine data transferring lines (as shown in FIG. 2B); the type-1 ring node transfers and switches data; two pairs of the data input ports and the data output ports of the type-1 ring node are for left side connection and right side connection data transfers; another pair of the data input port and the data output port of the type-1 ring node is for cross connection data transfers in the four-node ring switch structure; and the other pair of the data input port and the data output port of the type-1 ring node is for input interface and output interface data transfers. Accordingly, a novel circuit of an on-chip network having a four-node ring switch structure is obtained.

The proposed four-node ring switch structure, with ring paths in two directions (one clockwise, the other counterclockwise), supports efficient parallel data access, providing maximum local communication bandwidth. The proposed four-node ring switch structure has a simple arbiter, high bandwidth, and easy replication and linking properties. Hierarchically constructing four-node ring structures provides a low-cost, flexible and highly scalable design solution. Distributed arbiters are provided, so the complexity does not increase greatly with the number of ring nodes.

Overall, the present invention provides efficient parallel data access and a low-cost, flexible solution for multi-component communication.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention will be better understood from the following detailed description of the preferred embodiment according to the present invention, taken in conjunction with the accompanying drawings, in which

FIG. 1A and FIG. 1B are the views showing the type-0 ring node for dual-directional data transfer according to the preferred embodiment of the present invention;

FIG. 2A and FIG. 2B are the views showing the type-1 ring node for dual-directional data transfer;

FIG. 3A and FIG. 3B are the views showing the preferred embodiment;

FIG. 4A to FIG. 4E are the views showing the applications of the type-0 ring node;

FIG. 5A to FIG. 5E are the views showing the applications of the type-1 ring node;

FIG. 6A to FIG. 6F are the views showing data transference between the four nodes;

FIG. 7 is the view showing the hierarchical ring architecture having the two layers;

FIG. 8 is the view showing the two data transferring paths in the hierarchical ring architecture; and

FIG. 9 is the view showing the hierarchical ring architecture combined with multiple function units and multiple register banks.

DESCRIPTION OF THE PREFERRED EMBODIMENT

The following description of the preferred embodiment is provided for understanding the features and the structures of the present invention.

Please refer to FIG. 1A, FIG. 1B, FIG. 2A and FIG. 2B, which are views showing a type-0 ring node and a type-1 ring node for dual-directional data transfer according to a preferred embodiment of the present invention. As shown in the figures, the present invention is a circuit of an on-chip network having a four-node ring switch structure, comprising two type-0 ring nodes 11 and two type-1 ring nodes 12 for dual-directional data transfer.

The type-0 ring node 11, as shown in FIG. 1A, comprises three data input ports and three data output ports for transferring and switching data; and five data transferring lines (as shown in FIG. 1B), where two pairs of the data input ports and the data output ports are for left side connection and right side connection data transfers and the other pair of the data input port and the data output port is for input interface and output interface data transfers. The type-0 ring node 11 has a cut path to prevent an infinite loop. And, because the present invention does not need lines for left-in-left-out and right-in-right-out data transference, the nine (3×3) data transferring lines originally required are reduced to five data transferring lines.

The type-1 ring node 12, as shown in FIG. 2A, comprises four data input ports and four data output ports for transferring and switching data; and nine data transferring lines (as shown in FIG. 2B), where two pairs of the data input ports and the data output ports are for left side connection and right side connection data transfers; another pair of the data input port and the data output port is for cross connection data transfers; and the other pair of the data input port and the data output port is for input interface and output interface data transfers. The type-1 ring node 12 has a cross path to reach a destination which is not reachable because of the cut path. Thus, the type-1 ring node 12 supports lines for left-in-right-out and right-in-left-out data transference by connecting the cross input port and the data output port and by connecting the cross output port and the data input port. And, because the present invention does not need lines for left-in-left-out and right-in-right-out data transference, the sixteen (4×4) data transferring lines originally required are reduced to nine data transferring lines.
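
The line-count reductions stated for the two node types can be tallied with simple arithmetic. The following is an illustrative sketch only: the helper names `full_crossbar_lines` and `lines_saved` and the dictionary layout are assumptions made for this example, and the reduced counts (five and nine) are taken from the description above rather than derived from the figures.

```python
def full_crossbar_lines(ports: int) -> int:
    """Lines needed if every input port could reach every output port."""
    return ports * ports


# Port counts and reduced line counts as stated in the description.
# The dictionary layout itself is a hypothetical convenience.
NODE_TYPES = {
    "type-0": {"ports": 3, "reduced_lines": 5},
    "type-1": {"ports": 4, "reduced_lines": 9},
}


def lines_saved(node_type: str) -> int:
    """Difference between a full crossbar and the stated reduced count."""
    spec = NODE_TYPES[node_type]
    return full_crossbar_lines(spec["ports"]) - spec["reduced_lines"]


if __name__ == "__main__":
    for name, spec in NODE_TYPES.items():
        full = full_crossbar_lines(spec["ports"])
        print(f"{name}: {full} lines in a full crossbar, "
              f"{spec['reduced_lines']} used, {lines_saved(name)} saved")
```

Under these figures, a type-0 ring node uses four fewer lines than a full 3×3 switch and a type-1 ring node uses seven fewer lines than a full 4×4 switch.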

Please refer to FIG. 3A and FIG. 3B, which are views showing a preferred embodiment. As shown in the figures, the present invention comprises two type-0 ring nodes 11 and two type-1 ring nodes 12 for dual-directional data transfer, where the type-0 ring node 11 has a cut path to prevent an infinite loop and the type-1 ring node 12 has a cross path to reach a destination which is not reachable because of the cut path. As shown in FIG. 3A, a four-node ring network 1 of the present invention has two reversed directional circle paths running through four nodes; and cross paths between two corresponding nodes. In this way, peer-to-peer access is done by following certain paths. Since the four-node ring network is able to access data at the four nodes without collision by using the reversed directional circle paths and the cross paths, a best local bandwidth and a simple routing are obtained.

As shown in FIG. 3B, a controller is used to control the four-node ring network 1. Because the four nodes in the four-node ring network 1 are only allowed different destinations at the same time, the complexity of the arbiter calculation stays under 4! (four factorial, that is, 24), which shows a low arbitration complexity. In addition, data are transferred to a destination by a ring path, so that the routing complexity of multi-peer to multi-peer transfer is reduced, which gives a low routing complexity.
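
The 4! bound on arbitration cases can be checked by enumeration. The following is a minimal sketch, assuming the four nodes are numbered 0 to 3 and that each node names exactly one destination, all destinations being different; the function name `destination_assignments` and the `allow_self` switch are hypothetical. Whether a node may target itself is not stated in the description, so both counts are shown.

```python
from itertools import permutations

NODES = (0, 1, 2, 3)  # assumed numbering of the four ring nodes


def destination_assignments(allow_self: bool = True):
    """Yield tuples d where node i sends to d[i] and all d[i] differ."""
    for dests in permutations(NODES):
        if allow_self or all(s != d for s, d in zip(NODES, dests)):
            yield dests


all_cases = list(destination_assignments())
no_self_cases = list(destination_assignments(allow_self=False))

print(len(all_cases))      # 24, i.e. 4!
print(len(no_self_cases))  # 9 when no node targets itself
```

Either way, the number of simultaneous destination patterns the arbiter has to consider is small and fixed, independent of how many four-node rings are combined.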

Please refer to FIG. 4A to FIG. 4E and FIG. 5A to FIG. 5E, which are views showing applications of type-0 ring nodes and type-1 ring nodes. As shown in the figures, a type-0 ring node 11 for dual-directional data transfer used in a four-node ring network 1 has a cut path, so that switching between data on the left-to-right connection and data on the right-to-left connection is not available. As shown in FIG. 4B, first data 111 is transferred from a second node (node 1) to a first node (node 0); yet a transferring path from the second node (node 1), then a third node (node 2), then a fourth node (node 3) to the first node (node 0) is not possible owing to the cut path of the type-0 ring nodes 11. As shown in FIG. 4D, second data 112 is transferred from the fourth node, then the third node, to the second node. Therein, the first data and the second data both pass through the same third node by two reversed directional circle paths. Concerning FIG. 4C and FIG. 4E, the transferring paths are understood by following the same principle as is described above.

As shown in FIG. 5A to FIG. 5E, a type-1 ring node 12 for dual-directional data transfer used in the four-node ring network 1 has a cross path to obtain paths which are not available because of the cut path. Opposite nodes in the four-node ring network 1 cannot transfer data without the cross path because of the cut path of the type-0 ring nodes (the second node and the fourth node). As shown in FIG. 5B, third data 121 is transferred from the first node to the third node through the cross path, and fourth data 122 is transferred from the third node to the second node. Concerning FIG. 5C and FIG. 5E, the transferring paths are understood by following the same principle as is described above.

Please refer to FIG. 6A to FIG. 6F, which are views showing data transference between four nodes. As shown in the figures, one of the four-factorial (4!) conditions is shown, with the corresponding paths for simultaneously transferring data to different destinations. A path passes through at most three nodes between a starting node and an ending node in a four-node ring switch structure, which is an optimized path for the four-node ring switch structure having two reversed directional circle paths.
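
The path rules described above can be sketched as a small routing function. This is an illustrative model, not the circuit of the figures. It assumes that nodes are numbered 0 to 3 around the ring, that node 0 and node 2 are the type-1 ring nodes joined by the cross path, that node 1 and node 3 are the type-0 ring nodes, and, as one reading of the cut path, that a type-0 ring node does not forward ring traffic. The name `route` and the choice of which direction counts as clockwise are hypothetical.

```python
TYPE1_NODES = {0, 2}  # assumed: type-1 ring nodes, joined by the cross path
TYPE0_NODES = {1, 3}  # assumed: type-0 ring nodes, cut path (no pass-through)


def route(src: int, dst: int, clockwise: bool = True) -> list[int]:
    """Return the list of nodes visited from src to dst, endpoints included."""
    if src == dst:
        return [src]
    hops = (dst - src) % 4
    if hops != 2:
        # Adjacent nodes: one step along one of the two ring directions.
        return [src, dst]
    if src in TYPE1_NODES:
        # Opposite type-1 ring nodes use the cross path directly.
        return [src, dst]
    # Opposite type-0 ring nodes go through a neighbouring type-1 ring node.
    step = 1 if clockwise else -1
    via = (src + step) % 4
    return [src, via, dst]


if __name__ == "__main__":
    print(route(0, 2))                   # [0, 2] via the cross path
    print(route(3, 1, clockwise=False))  # [3, 2, 1] through node 2
    longest = max(len(route(s, d)) for s in range(4) for d in range(4))
    print(longest)                       # 3
```

Under this model no route visits more than three nodes, endpoints included, and every intermediate node is a type-1 ring node, which is consistent with the three-node bound stated above.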

Please refer to FIG. 7 to FIG. 9, which are views showing a hierarchical ring architecture having two layers; an example of two data transferring paths crossing the top layer of the hierarchical ring architecture; and the hierarchical ring architecture combined with multiple function units and multiple register banks. As shown in the figures, a four-node ring network 1 has four nodes for accessing data. A hierarchical ring architecture is thus provided to extend the circuit into a multi-node on-chip network. The hierarchical ring architecture is based on the four-node ring network 1 and comprises four four-node ring networks 1 connected with one four-node ring network 1. The hierarchical ring architecture having two layers of the four-node ring networks provides 16 nodes for mounting components. If more nodes are required, a further layer is added to provide 64 nodes, which suits the integration of a great number of nodes.
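
The growth in node count per layer follows a power of four. The following is a minimal sketch of that arithmetic; the function names `leaf_nodes` and `ring_networks` are hypothetical. The two-layer values match the description above, while the ring count for three layers is an extrapolation that assumes each node of an upper-layer ring attaches to exactly one lower-layer ring.

```python
def leaf_nodes(layers: int) -> int:
    """Nodes available for mounting components in a hierarchy of this depth."""
    if layers < 1:
        raise ValueError("at least one layer is required")
    return 4 ** layers


def ring_networks(layers: int) -> int:
    """Four-node ring networks used, assuming one lower ring per upper node."""
    if layers < 1:
        raise ValueError("at least one layer is required")
    return sum(4 ** k for k in range(layers))


if __name__ == "__main__":
    print(leaf_nodes(2), ring_networks(2))  # 16 5
    print(leaf_nodes(3), ring_networks(3))  # 64 21
```

For two layers this gives 16 nodes built from five rings (four lower rings plus one top ring), and a third layer raises the node count to 64.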

Two paths, passing through six and eight nodes respectively, are shown in FIG. 8 for the hierarchical ring architecture of four four-node ring networks 1. A communication interface of the hierarchical ring architecture needs only an additional buffer, which achieves a low control complexity and minimizes global control. Thus, in the hierarchical ring architecture of four-node ring networks 1, routing complexity between multiple nodes is reduced by using ring paths to reach targets and by having fewer routings at the higher layer of the hierarchical ring architecture. Consequently, a low routing complexity is obtained, which optimizes the practical use of an on-chip network.

As shown in FIG. 9, a hierarchical ring architecture of four-node ring networks connects eight function units 5 and sixteen register banks 4 for a multithreaded, multiple-issue processor, where data are rapidly exchanged between the multiple function units and the register banks; thus performance is greatly enhanced and monolithic register file cost is reduced.

As a conclusion, the present invention has the following advantages:

1. A four-node ring network according to the present invention provides four nodes to access data simultaneously with only two types of ring nodes and two reversed directional ring paths for obtaining a wide bandwidth, a low cost as well as simplicity.

2. A hierarchical ring architecture of the four-node ring networks provides global connections for farther nodes through its higher layer. By doing so, path complexity is reduced and a shorter path for farther nodes is obtained.

3. Each of the four-node ring networks contains a local arbiter to locally control the path selection. Thus, a global control is greatly reduced with low control complexity and high circuit flexibility.

4. The hierarchical ring architecture of four-node ring networks has a strong localization property, which optimizes the practical use of an on-chip network.

To sum up, the present invention is a circuit of an on-chip network having a four-node ring switch structure, where multiple data transfer in parallel with a dynamic configuration is provided between multiple processor cores or between multiple function units and register banks for obtaining a low control complexity, an optimized local bandwidth, optimized paths for far nodes, and a simplified circuit.

The preferred embodiment herein disclosed is not intended to unnecessarily limit the scope of the invention. Therefore, simple modifications or variations belonging to the equivalent of the scope of the claims and the instructions disclosed herein for a patent are all within the scope of the present invention. 

1. A circuit of an on-chip network having a four-node ring switch structure, comprising: two type-0 ring nodes for dual-directional data transfer, each said type-0 ring node comprising three data input ports, three data output ports and five data transferring lines, said type-0 ring node transferring and switching data, two pairs of said data input ports and said data output ports being for left side connection and right side connection data transfers, the other pair of said data input port and said data output port being for input interface and output interface data transfers; and two type-1 ring nodes for dual-directional data transfer, each said type-1 ring node comprising four data input ports, four data output ports and nine data transferring lines, said type-1 ring node transferring and switching data, two pairs of said data input ports and said data output ports being for left side connection and right side connection data transfers, another pair of said data input port and said data output port being for cross connection data transfers in said ring switch structure, the other pair of said data input port and said data output port being for input interface and output interface data transfers.
 2. The circuit according to claim 1, wherein said type-0 ring node has a cut path.
 3. The circuit according to claim 1, wherein said type-1 ring node has a cross path.
 4. The circuit according to claim 1, wherein said circuit further has a hierarchical ring architecture to save nodes and to expand said circuit. 